Abstract —Polar codes under cyclic redundancy check aided 
successive cancellation list (CA-SCL) decoding can outperform 
the turbo codes and the LDPC codes when code lengths are 
configured to be several kilobits. In order to reduce the decoding 
complexity, a novel tree-pruning scheme for the SCL/CA-SCL 
decoding algorithms is proposed in this paper. In each step of 
the decoding procedure, the candidate paths with metrics less 
than a threshold are dropped directly to avoid the unnecessary 
computations for the path searching on the descendant branches 
of them. Given a candidate path, an upper bound of the path 
metric of its descendants is proposed to determine whether the 
pruning of this candidate path would affect frame error rate 
(FER) performance. By utilizing this upper bounding technique 
and introducing a dynamic threshold, the proposed scheme 
deletes as many redundant candidate paths as possible while 
keeping the performance deterioration in a tolerable region, and 
is thus much more efficient than the existing pruning scheme. With 
only a negligible loss of FER performance, the computational 
complexity of the proposed pruned decoding scheme is only about 
40% of that of the standard algorithm in the low signal-to-noise ratio 
(SNR) region (where the FER under CA-SCL decoding is about 
0.1 ~ 0.001), and it can be very close to that of the successive 
cancellation (SC) decoder in the moderate and high SNR regions. 

Index Terms —Polar codes, successive cancellation decoding, 
tree-pruning. 

I. Introduction 

POLAR codes have been proven to achieve the symmetric 
capacity of binary-input discrete memoryless channels 
under a low-complexity successive cancellation (SC) decoding 
algorithm [1]. Although polar codes asymptotically achieve 
the channel capacity, their performance under SC decoding is 
unsatisfying when the code length is of the order of kilobits. 
Several alternative decoding schemes have been proposed to 
improve the finite-length performance of polar codes, such as 
successive cancellation list (SCL) [2], successive cancellation 
stack (SCS) [3] and belief propagation (BP) [4] decoding 
algorithms. It is reported that polar codes under the CRC-aided 
SCL/SCS (CA-SCL/SCS) decoding algorithms can achieve a 
better frame error rate (FER) performance than the LDPC and 
turbo codes when the code lengths are configured to several 
kilobits mmm. Therefore, polar coding is believed to be a 
competitive candidate in future communication systems. 

Since the CA-SCS decoding requires a large stack to store 
the candidate paths which leads to a high space complexity, 
CA-SCL decoding algorithm is of more interest nail Da. 
Nevertheless, to achieve competitive performance against 
LDPC or turbo codes, a moderate-sized list is required in CA- 
SCL decoding. In that case, the computational complexity of 
the CA-SCL decoder is still high. 

As stated in HD, SCL decoding can be regarded as a 
path searching procedure on the code tree. In order to reduce 
the complexity of SCL decoding, tree-pruning technique is 
exploited by avoiding unnecessary path searching operations 
ltl2ft . In order to keep the FER loss in an acceptable region, 
D3 computes the pruning threshold in a very conservative 
way. Only the candidate paths with metrics much less than 
the maximum one are pruned. This works well when the 
signal-to-noise ratio (SNR) is high, where the metric of the correct 
path is usually much larger than the others. However, this 
existing pruning technique is no longer efficient when working 
in the relatively low SNR region where the FER under CA-SCL 
decoding is about 0.1 ~ 0.001, which is exactly the operating 
regime for cellular networks. 

In this paper, we propose to compute the threshold using 
the sum of the survival path metrics. To evaluate how much 
a pruned candidate path would affect FER performance, we 
propose a metric upper bound of its descendants. Utilizing 
this upper bounding technique, a dynamic threshold is further 
proposed. The proposed scheme deletes as many redundant 
candidate paths as possible while keeping the performance 
deterioration in a tolerable region, and is thus much more efficient 
than the existing pruning scheme. 

The remainder of the paper is organized as follows. 
Section II reviews the basics of polar coding. Section III 
describes the proposed tree-pruning scheme for SCL decoding, 
including a path metric upper bound for the descendants of a 
given candidate path and a dynamic threshold configuration 
method. Section IV provides the performance 
and complexity analysis based on the simulation results. 
Finally, Section V concludes the work. 

II. Preliminaries 
A. Notation Convention 

In this paper, we use calligraphic characters, such as $\mathcal{X}$ and 
$\mathcal{Y}$, to denote sets, and $|\mathcal{X}|$ to denote the number of elements in 
$\mathcal{X}$. We write the Cartesian product of $\mathcal{X}$ and $\mathcal{Y}$ as $\mathcal{X} \times \mathcal{Y}$, and 
write the n-th Cartesian power of $\mathcal{X}$ as $\mathcal{X}^n$. Further, we write 
$\mathcal{Y} \setminus \mathcal{X}$ to denote the subset of $\mathcal{Y}$ with the elements in $\mathcal{X}$ excluded. 

We use the notation $v_1^N$ to denote an N-dimensional 
vector $(v_1, v_2, \cdots, v_N)$ and $v_i^j$ to denote a subvector 
$(v_i, v_{i+1}, \cdots, v_{j-1}, v_j)$ of $v_1^N$, $1 \le i, j \le N$. Particularly, 
when $i > j$, $v_i^j$ is a vector with no elements in it and the 
empty vector is denoted by $\phi$. We write $v_{1,o}^N$ to denote the 
subvector of $v_1^N$ with odd indices ($v_k$ : $1 \le k \le N$; k is 
odd). Similarly, we write $v_{1,e}^N$ to denote the subvector of $v_1^N$ 
with even indices ($v_k$ : $1 \le k \le N$; k is even). For example, 
for $v_1^4$, $v_2^3 = (v_2, v_3)$, $v_{1,o}^4 = (v_1, v_3)$ and $v_{1,e}^4 = (v_2, v_4)$. 
Further, given an index set $\mathcal{A}$, $v_{\mathcal{A}}$ denotes the subvector of $v_1^N$ 
which consists of the $v_i$s with $i \in \mathcal{A}$. 
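
As a small illustration of this index notation, the following Python sketch mirrors the four subvector operations. The helper names (`subvector`, `odd_part`, `even_part`, `restrict`) are hypothetical and chosen only for this example; indices are 1-based, as in the text.

```python
def subvector(v, i, j):
    """Return v_i^j with 1-based inclusive indices; empty tuple when i > j."""
    return tuple(v[i - 1:j]) if i <= j else ()


def odd_part(v):
    """Entries with odd 1-based indices (v_1, v_3, ...)."""
    return tuple(v[0::2])


def even_part(v):
    """Entries with even 1-based indices (v_2, v_4, ...)."""
    return tuple(v[1::2])


def restrict(v, index_set):
    """Entries v_i with i in the 1-based index set, in increasing index order."""
    return tuple(v[i - 1] for i in sorted(index_set))


v = ("v1", "v2", "v3", "v4")
print(subvector(v, 2, 3))   # the subvector v_2^3
print(odd_part(v))          # the odd-indexed part of v_1^4
print(even_part(v))         # the even-indexed part of v_1^4
print(restrict(v, {1, 4}))  # the subvector selected by the index set {1, 4}
```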


B. Polar Coding and SC Decoding 

We are given a binary-input memoryless channel $W : \mathcal{X} \to \mathcal{Y}$ 
with input alphabet $\mathcal{X} = \{0,1\}$ and output alphabet $\mathcal{Y}$; the 
channel transition probabilities are $W(y|x)$, $x \in \mathcal{X}$, $y \in \mathcal{Y}$. 

For code length $N = 2^n$, $n = 1, 2, \cdots$, and information 
length K, i.e. code rate $R = K/N$, polar coding over W 
proposed by Arikan can be described as follows: 

After channel combining and splitting operations on N 
independent uses of W, we get N successive uses of synthesized 
binary-input channels $W_N^{(i)}$, $i = 1, 2, \cdots, N$, with transition 
probabilities 

$$W_N^{(i)}(y_1^N, u_1^{i-1} | u_i) = \sum_{u_{i+1}^N \in \mathcal{X}^{N-i}} \frac{1}{2^{N-1}} W_N(y_1^N | u_1^N) \tag{1}$$

where 

$$W_N(y_1^N | u_1^N) = \prod_{i=1}^{N} W(y_i | x_i) \tag{2}$$

and the source block $u_1^N$ is assumed to be uniformly 
distributed in $\{0,1\}^N$. 

The reliabilities of the polarized channels $\{W_N^{(i)}\}$ can be 
evaluated by using density evolution El. and is usually more 
evaluated efficiently by calculating Bhattacharyya parameters 
m for binary erasure channels (BECs) or by using Gaussian 
approximation m for binary-input AWGN (BIAWGN) chan¬ 
nels. 

To transmit a message block of K bits, the K most reliable 
polarized channels $W_N^{(i)}$ with indices $i \in \mathcal{A}$ are picked 
out for carrying these information bits; a fixed bit sequence 
called frozen bits is transmitted over the others. The index set 
$\mathcal{A} \subseteq \{1, 2, \cdots, N\}$ is called the information set, with $|\mathcal{A}| = K$, 
and its complement set, which is denoted by $\mathcal{A}^c$, is called the 
frozen set. 
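
The selection of the information set can be illustrated for the binary erasure channel, where the Bhattacharyya parameters of the polarized channels follow the well-known exact recursion $Z(W_{2N}^{(2i-1)}) = 2Z(W_N^{(i)}) - Z(W_N^{(i)})^2$ and $Z(W_{2N}^{(2i)}) = Z(W_N^{(i)})^2$. The sketch below is only an illustration under that BEC assumption; the function names are hypothetical, and a smaller Bhattacharyya parameter is taken to mean a more reliable channel.

```python
def bec_bhattacharyya(n, eps):
    """Bhattacharyya parameters of the N = 2**n polarized channels of a BEC
    with erasure probability eps, listed in natural index order (i = 1..N)."""
    z = [eps]
    for _ in range(n):
        nxt = []
        for zi in z:
            nxt.append(2 * zi - zi * zi)  # channel 2i-1 (degraded)
            nxt.append(zi * zi)           # channel 2i   (upgraded)
        z = nxt
    return z


def information_set(n, eps, K):
    """1-based indices of the K most reliable polarized channels."""
    z = bec_bhattacharyya(n, eps)
    order = sorted(range(len(z)), key=lambda i: z[i])  # most reliable first
    return sorted(i + 1 for i in order[:K])


# N = 8, K = 4 over a BEC with erasure probability 0.5
info = information_set(3, 0.5, 4)
frozen = sorted(set(range(1, 9)) - set(info))
print(info, frozen)
```

For BIAWGN channels the same selection step applies, with the reliabilities obtained from Gaussian approximation instead of this recursion.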

As mentioned in 12, polar codes can be decoded using 
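successive cancellation (SC) decoding algorithm. In [11], it is 
further described as a path searching procedure on a decoding 
tree. The metric of a decoding path $u_1^i$ can be measured using 
the a posteriori probability 

$$P_N^{(i)}(u_1^i | y_1^N) = \begin{cases} 0 & \text{if } i \in \mathcal{A}^c \text{ and } u_i \text{ is a wrong frozen bit} \\ \dfrac{W_N^{(i)}(y_1^N, u_1^{i-1} | u_i)}{2 P(y_1^N)} & \text{otherwise} \end{cases} \tag{3}$$

When $u_i$ is not a wrong frozen bit, the above path metric can 
be recursively computed as 

$$P_{2N}^{(2i-1)}(u_1^{2i-1} | y_1^{2N}) = \sum_{u_{2i} \in \{0,1\}} P_N^{(i)}(u_{1,o}^{2i} \oplus u_{1,e}^{2i} | y_1^N) \cdot P_N^{(i)}(u_{1,e}^{2i} | y_{N+1}^{2N}) \tag{4}$$

$$P_{2N}^{(2i)}(u_1^{2i} | y_1^{2N}) = P_N^{(i)}(u_{1,o}^{2i} \oplus u_{1,e}^{2i} | y_1^N) \cdot P_N^{(i)}(u_{1,e}^{2i} | y_{N+1}^{2N}) \tag{5}$$

where $n \ge 0$, $N = 2^n$, $1 \le i \le N$. 

Thus, SC decoding can be described as a greedy search 
algorithm on the code tree. In each level, only the one of the 
two descendants with the larger path metric is selected for further 
expansion, 

$$\hat{u}_i = \begin{cases} h_i(y_1^N, \hat{u}_1^{i-1}) & \text{if } i \in \mathcal{A} \\ u_i & \text{if } i \in \mathcal{A}^c \end{cases} \tag{6}$$

where 

$$h_i(y_1^N, \hat{u}_1^{i-1}) = \begin{cases} 0 & \text{if } P_N^{(i)}((\hat{u}_1^{i-1}, 0) | y_1^N) \ge P_N^{(i)}((\hat{u}_1^{i-1}, 1) | y_1^N) \\ 1 & \text{otherwise} \end{cases} \tag{7}$$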

C. Improved SC Decoding Algorithms 

The performance of SC is limited by the bit-by-bit decoding 
strategy. Whenever a bit is wrongly determined, there is no 
chance to correct it in the rest of the decoding procedure. 

Theoretically, the performance of the maximum a posteriori 
probability (MAP) decoding (or equivalently ML decoding, 
since the inputs are assumed to be uniformly distributed) can 
be achieved by traversing all the N-length decoding paths in 
the code tree. But this brute-force search takes exponential 
complexity and is infeasible for practical 
code lengths. 

Two improved decoding algorithms, SCL decoding and SCS 
decoding, are proposed in 12 and 0. Both of these two 
algorithms allow more than one edge to be explored in each 
level of the code tree. During SCL (SCS) decoding, a set 
of candidate paths is obtained and stored in a list (stack). 
Combining the ideas of SCL and SCS, a decoding algorithm 
named successive cancellation hybrid (SCH) is proposed in 
m. which can achieve a better trade-off between computational 
complexity and space complexity. Moreover, with the 
help of CRC codes, polar codes decoded by these improved 
SC decoding algorithms are found to be capable of achieving 
the same or even better performance than turbo codes or LDPC 
codes 0 0 0. 

Among these existing improved SC decoding algorithms, 
benefitting from the limited requirement for the memory, 
(CA-)SCL decoding is the most interesting for hardware 
implementation 0 0 [15] [16]. As shown in Fig. 1, the 
processing loop of the standard SCL/CA-SCL decoding is as 
follows: 

S1) For each candidate path, calculate the path metrics of its 
descendant paths; 

S2) Sort the metrics, keep at most L paths with the 
largest metrics and delete the others; 

S3) If any two of the survival paths share the same parent 
node, perform a copy operation to create 
separate working spaces for these two paths; 

S4) For each survival path, update the partial sums 
recursively; 

S5) Repeat the above loop until the length of the candidate 
paths reaches N. The candidate path with the largest 
path metric (when a CRC is embedded, the candidates which 
cannot pass the CRC are dropped directly) is picked out for 
the final decision. 

Fig. 1. The flow chart of the pruned SCL/CA-SCL decoding. 
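
Steps S2) and S3) of this loop can be sketched as follows. This is a minimal illustration, not a decoder: each candidate is assumed to be a hypothetical tuple `(parent_index, bit, metric)` produced by step S1), and the function names are chosen only for this example.

```python
def select_survivors(candidates, L):
    """Step S2: keep at most L candidates with the largest path metrics.
    Each candidate is a tuple (parent_index, bit, metric)."""
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    return ranked[:L]


def parents_needing_copy(survivors):
    """Step S3: parents whose two descendants both survived, so that a copy
    of the parent's working space is required."""
    counts = {}
    for parent, _bit, _metric in survivors:
        counts[parent] = counts.get(parent, 0) + 1
    return sorted(p for p, c in counts.items() if c == 2)


# Two parent paths, each extended with bit 0 and bit 1 (step S1 output).
candidates = [(0, 0, 0.30), (0, 1, 0.25), (1, 0, 0.20), (1, 1, 0.05)]
survivors = select_survivors(candidates, L=3)
print(survivors)
print(parents_needing_copy(survivors))
```

In this toy case only parent 0 keeps both descendants, so only one copy operation is needed.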


III. Tree-Pruning Scheme for SCL/CA-SCL 
Decoding Algorithm 

A. The Proposed Pruning Scheme 

In order to reduce the computational complexity of SCL 
decoding, a pruning operation is added after the sorting 
operation (as shown in Fig. 1). If the metric of some candidate 
path is less than a threshold, the path is deleted directly to avoid 
redundant path expansions and copy operations. 

In this paper, we propose to compute the threshold from the path 
metric sum of the (at most) L survival candidate paths after sorting. 
While decoding the i-th bit, the metrics of the survival paths are 
$\{P_j^{(i)}\}$, $j \in \mathcal{L}_i$, where $\mathcal{L}_i$ is the index set of the survival paths in 
the list after the sorting operation, $1 \le |\mathcal{L}_i| \le L$. If the following 
inequality holds for some $j \in \mathcal{L}_i$, the corresponding path is 
deleted: 

$$P_j^{(i)} < \alpha_i \sum_{k=1}^{|\mathcal{L}_i|} P_k^{(i)} \tag{8}$$

where $0 \le \alpha_i \le 1$. In particular, if $\alpha_i = 0$, then no pruning is 
performed when decoding the i-th bit. In the following part 
of this section, we discuss how to choose the values of $\{\alpha_i\}$. 
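
The pruning test can be sketched in a few lines. The sketch assumes the rule reads "delete path j when its metric is below $\alpha_i$ times the sum of the survival path metrics"; the function name and the example metrics are illustrative only.

```python
def prune(metrics, alpha):
    """Split survival paths into kept and pruned index lists.
    A path is pruned when its metric is below alpha times the metric sum."""
    threshold = alpha * sum(metrics)
    kept = [j for j, p in enumerate(metrics) if p >= threshold]
    pruned = [j for j, p in enumerate(metrics) if p < threshold]
    return kept, pruned


metrics = [0.5, 0.25, 0.1875, 0.0625]  # survival path metrics after sorting
print(prune(metrics, alpha=0.1))  # only the weakest path falls below the threshold
print(prune(metrics, alpha=0.0))  # alpha = 0 disables pruning
```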

1) Performance Deterioration: Suppose that the correct 
path is still in the list after the sorting operation while 
decoding the i-th bit. The probability that the j-th candidate 
is the correct path (i.e., the performance loss of deleting this 
path) is computed as 

$$\Pr\{j\text{-th path is the correct one}\} = \frac{P_j^{(i)}}{\sum_{k=1}^{|\mathcal{L}_i|} P_k^{(i)}} \tag{9}$$

Fig. 2. The probability density function of LLR. 


2) Statistical Threshold Configuration: Given a specific polar 
code, the channel property, and a tolerable FER performance 
loss $P_{tol}$, the most direct way to configure $\alpha_i$ is through Monte 
Carlo simulation. 

Initially, set $\alpha_i = 1$ and simulate using standard (CA-)SCL 
decoding. While decoding the i-th bit in each frame, the ratio 
of the metric of the correct path (up to the i-th bit) $P_c^{(i)}$ to 
the sum metric of the survival paths in the list is recorded. If 
the final decoding result is correct and the ratio is less than 
$\alpha_i$, then $\alpha_i$ is updated with this ratio, i.e., 

$$\alpha_i = \min\left(\alpha_i, \frac{P_c^{(i)}}{\sum_{k=1}^{|\mathcal{L}_i|} P_k^{(i)}}\right) \tag{10}$$

When the number of simulated frames is large enough, 
the FER performance loss of the pruning operation based on (8) 
can be very small. 
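
The update rule of this Monte Carlo procedure can be sketched as below. The sketch assumes that the per-bit ratios of the correct path metric to the metric sum have already been recorded by a decoder simulation, which is not shown; `update_thresholds` and the example values are hypothetical.

```python
def update_thresholds(alpha, ratios, frame_correct):
    """Per-frame threshold update: alpha[i] shrinks to the recorded ratio of
    bit i when the frame was decoded correctly and the ratio is smaller."""
    if not frame_correct:
        return list(alpha)
    return [min(a, r) for a, r in zip(alpha, ratios)]


alpha = [1.0, 1.0, 1.0]  # initial thresholds for a toy 3-bit example
frames = [
    ([0.9, 0.4, 0.7], True),
    ([0.5, 0.6, 0.2], False),  # wrongly decoded frame: ignored
    ([0.95, 0.3, 0.8], True),
]
for ratios, ok in frames:
    alpha = update_thresholds(alpha, ratios, ok)
print(alpha)
```

Because each threshold ends at the smallest ratio observed for the correct path, the correct path of every correctly decoded training frame would pass the pruning test.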


B. Dynamic Threshold Configuration 

The Monte Carlo configuration is dependent on the specific 
SNR, code length, and code rate. Thus, it is quite difficult to 
use in practical applications. For polar codes, the reliability 
of the polarized channels can be evaluated using Gaussian 
approximation [14] or some other techniques; in other words, 
the probability density functions (PDFs) of the LLRs 
corresponding to the received bits (conditioned on the 
previous bits being correctly decoded) can be a priori information 
to the decoder. In this subsection, we present a method to 
estimate the performance loss brought by pruning using these 
LLR distributions; then, a dynamic threshold configuration 
method is proposed. Using the proposed thresholding method, 
the pruned (CA-)SCL decoding can fully utilize the tolerable 
performance deterioration and thus lower the computational 
complexity. 

1) Path Metric Upper Bounds: The LLR PDFs can be 
obtained by using density evolution or Gaussian approximation 
ns. Based on the PDF corresponding to a bit $u_i$, we 
can define an LLR region $[-l_i, l_i]$ such that the probability that 
the corresponding LLR takes values outside this region is no larger 
than a pre-defined small probability $P_{llr}$, 

$$\Pr\{-l_i \le \mathrm{LLR}_i \le l_i\} \ge 1 - P_{llr} \tag{11}$$

Therefore, when decoding the i-th bit, if one candidate path 
has metric $P^{(i)}$, the metric $P^{(j)}$ of any of its descendant paths at 
the j-th level has an upper bound, 

$$P_{ub}^{(j)} = P^{(i)} \prod_{k=i+1}^{j} \left(1 + e^{-l_k}\right)^{-1} \ge P^{(j)} \tag{12}$$

where $1 \le i < j \le N$. 

Note that for bit index $k \in [i, j]$, every bit affects the value 
of $P_{ub}^{(j)}$ to some extent, no matter whether it is an information bit or a 
frozen bit. Specifically, for an information bit with relatively 
high reliability, i.e., with a large $l_k$, its impact on $P_{ub}^{(j)}$ is 
considered negligible; for a frozen bit, since the value of $l_k$ 
is relatively smaller, its impact on $P_{ub}^{(j)}$ is more significant. 

Fig. 3 gives the simulation result of a (1024, 512) polar code 
under BIAWGNC with SNR 1.5dB. The decoding algorithm 
is CA-SCL with L = 32. The maximum and average values 
of the path metric during decoding each bit are recorded. To 
guarantee that the inequality (12) holds with probability larger than 
$1 - 10^{-9}$, we set $P_{llr} = 10^{-9}/N$. As shown in the figure, the 
simulation data is well bounded by (12). 

Fig. 3. The upper bound of path metric. (Panel title: (1024,512), $E_b/N_0$ = 1.5dB.) 

2) Threshold Computation: In this subsection, we propose 
a new threshold computation method which can fully utilize 
the pre-defined tolerable FER performance loss $P_{tol}$. 

As previously stated, the pruning operation during decoding $u_i$ 
will cause some FER performance loss. When expanding at 
level-(i + 1) on the code tree, the loss brought by the paths 
pruned at level-i is accumulated, i.e., the paths which cause 
performance loss during decoding $u_{i+1}$ include not only the 
newly pruned paths but also the descendants of the paths 
pruned at level-i. Thus, when decoding at level-i on the code 
tree, the FER performance loss is computed based on both the 
newly pruned paths at level-i and the descendants of all the 
previously pruned paths which would be in the list. A graphic 
illustration is given in Fig. 4. 

Fig. 4. A graphic demonstration of the pruned paths which would survive. 
(Legend: survival paths; pruned paths; descendants of pruned paths which 
would survive; descendants of pruned paths which wouldn't survive.) 

To estimate the FER loss brought by the pruning operations, 
the pruned paths should be recorded. Let $\mathcal{S}_i$ be the set of active 
pruned paths during decoding the first i bits, i.e., $u_1^i$. For each 
pruned path $k \in \mathcal{S}_i$, the level index $t_k$ at which it is pruned, 
along with the corresponding path metric $p_k$ and the estimated 
performance loss $q_k$, which is computed using (9), is recorded. 
Obviously, $t_k \le i$. Based on $(t_k, p_k, q_k)$, the maximum metric 
value at the i-th level of the descendants of the pruned path 
k can be computed using (12), 

$$z_k^{(i)} = p_k \prod_{m=t_k+1}^{i} \left(1 + e^{-l_m}\right)^{-1} \tag{13}$$

The performance loss $P_{de}^{(i)}$ which is brought by the pruning 
operations during decoding the first i bits is evaluated as 
follows: 

S1) Find the survival paths in the list $\mathcal{L}_i$ whose 
metrics are larger than the maximum $z_k^{(i)}$, 

$$\mathcal{L}'_i = \left\{ j \,\middle|\, j \in \mathcal{L}_i,\ P_j^{(i)} > \max_{k \in \mathcal{S}_i} z_k^{(i)} \right\} \subseteq \mathcal{L}_i \tag{14}$$

the number of these paths is $|\mathcal{L}'_i|$; 

S2) Find $(L - |\mathcal{L}'_i|)$ pruned records with indices $\mathcal{S}'_i \subseteq \mathcal{S}_i$ 
which have the largest estimated performance losses, i.e., 
for any $k \in \mathcal{S}'_i$ and $k' \in \mathcal{S}_i \setminus \mathcal{S}'_i$, we have $q_k \ge q_{k'}$, where 
$|\mathcal{S}'_i| = L - |\mathcal{L}'_i|$; 

S3) The performance loss $P_{de}^{(i)}$ is upper bounded by 

$$P_{de}^{(i)} \le \sum_{k \in \mathcal{S}'_i} q_k \tag{15}$$

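The threshold $\alpha_i$ is determined by the tolerable performance 
loss $P_{tol}$ and the loss introduced in the previous decoding 
process $P_{de}^{(i-1)}$, 

$$\alpha_i = \frac{\min_{j \in \mathcal{L}_i \setminus \mathcal{R}_i} P_j^{(i)}}{\sum_{j \in \mathcal{L}_i} P_j^{(i)}} \tag{16}$$

where the index set $\mathcal{R}_i$ indicates the candidates to be pruned and 
is the largest subset of $\mathcal{L}_i$ which satisfies 

$$\sum_{j \in \mathcal{R}_i} P_j^{(i)} \le \left(P_{tol} - P_{de}^{(i-1)}\right) \sum_{j \in \mathcal{L}_i} P_j^{(i)} \tag{17}$$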

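After the pruning, the set of pruned records $\mathcal{S}_i$ is updated 
as follows: 

S1) Combining $\mathcal{S}_{i-1}$ and the newly pruned paths which are 
induced by $\mathcal{R}_i$, the obtained temporary index set is 
denoted by $\mathcal{T}_i$; 

S2) Find the L pruned records with the largest losses in $\mathcal{T}_i$; the 
resulting indices form the set $\mathcal{T}'_i$, i.e., $\mathcal{T}'_i \subseteq \mathcal{T}_i$ and, for any 
$k \in \mathcal{T}'_i$ and $k' \in \mathcal{T}_i \setminus \mathcal{T}'_i$, we have $q_k \ge q_{k'}$; 

S3) The minimum value of the metric upper bounds of the 
pruned records in $\mathcal{T}'_i$ is 

$$z_{\min} = \min_{k \in \mathcal{T}'_i} z_k^{(i)} \tag{18}$$

S4) $\mathcal{S}_i$ is obtained by inactivating all the records in $\mathcal{T}_i$ with 
estimated metric less than $z_{\min}$, 

$$\mathcal{S}_i = \left\{ k \,\middle|\, k \in \mathcal{T}_i,\ z_k^{(i)} \ge z_{\min} \right\} \tag{19}$$

Note that, initially, $\mathcal{S}_0 = \emptyset$. 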
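
The record update in steps S1) to S4) can be sketched as below. Each record is assumed to be a hypothetical dictionary holding its estimated loss `q` and its current metric upper bound `z`; combining the old records with the newly pruned paths (step S1) is taken as already done by the caller.

```python
def update_records(records, L):
    """Keep the active pruned records: take the L records with the largest
    losses, find the smallest metric bound among them, and inactivate every
    record whose bound is below that value."""
    if not records:
        return []
    top = sorted(records, key=lambda r: r["q"], reverse=True)[:L]  # step S2
    z_min = min(r["z"] for r in top)                               # step S3
    return [r for r in records if r["z"] >= z_min]                 # step S4


records = [
    {"q": 0.05, "z": 0.04},
    {"q": 0.02, "z": 0.01},
    {"q": 0.01, "z": 0.03},
    {"q": 0.001, "z": 0.005},
]
active = update_records(records, L=2)
print(active)
```

Here the third record stays active although its loss is not among the two largest, because its metric bound is not below the minimum bound of the top records.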
Fig. 5. FER performances under different decoding schemes. 


Fig. 6. Average computational complexity. 


C. Complexity 

The complexity of (CA-)SCL decoding consists of three 
parts: path extension (including the updating of the path metrics 
by (4), (5) and of the partial sums), path metric sorting, and 
path copy. 

With pruning applied, many redundant path extensions along 
with path copies are avoided. Since the computational 
complexity to obtain a length-N path is $O(N \log N)$ ATI , and 
in the best case only one path is preserved in the list, 
the computational complexity is reduced by $O(LN \log N)$. 
However, calculating the threshold itself introduces additional 
computations. For each information bit, the metrics of the 
survival paths and the pruned paths are added up to compute 
the threshold, so the complexity increases by $O(LN)$. 
Thus, the computational complexity can be reduced by the order 
of $O(LN \log N)$ if $P_{tol}$ is set to a proper value. 

Moreover, when one of the two descendants of a single 
parent path is pruned, there is no longer any need for the path 
copy operation. In fact, this is the usual case, especially when 
the corresponding polarized channel has high reliability. 
Therefore, the number of required path copies is also reduced. 

As to the path metric sorting, the least reliable paths are 
required to be picked out when computing the threshold, so 
the pruning does not reduce the sorting complexity. 

IV. Simulation Results 

In this section, we analyze the performance of the proposed 
pruned (CA-)SCL decoding algorithm via simulation. The 
simulated polar code has code length N = 1024 and 
code rate R = 1/2, and is constructed under $E_b/N_0$ = 1.5dB 
using Gaussian approximation [14]. The information block 
is assumed to have 16 embedded CRC bits, and CA-SCL 
decoding is applied. 

Fig. 5 shows the FER performances under different L 
values and pruning techniques. Fig. 6 and Fig. 7 show the 
corresponding average computational complexity and average 
number of path copies, respectively. The average computational 
complexity is evaluated in terms of the number of metric 
recursive operations, which are defined in (4) and (5). Here, 
we pay more attention to the SNR region where the FER 
is around 0.1 ~ 0.001, which is the region of interest. In particular, 
the thresholds of 'sum statistical' are obtained by Monte Carlo 
simulation. As shown in the figures, when $P_{tol}$ takes a relatively 
conservative value (compared with the FER), that is $10^{-5}$, none 
of the pruning techniques introduces a noticeable loss in FER, 
while the proposed scheme has much lower complexity than 
the existing scheme in [12]. When decoding with CA-SCL 
with L = 32 and $P_{tol}$ = FER, the performance deteriorates 
and is very close to that of standard CA-SCL with L = 16, while the 
average complexity is even lower than that of the standard one with 
L = 8. Further, when $P_{tol}$ = 0.1 × FER, the FER performance 
loss is less than 0.01dB, but the complexity is reduced by 
50% ~ 75%. 

Fig. 7. Number of path copy operations. 

Fig. 8 compares the FER and FER loss of the proposed 
pruning scheme and [12] under different target losses $P_{tol}$. 
The $E_b/N_0$ is fixed to 1.5dB. As shown in the figure, when 
$P_{tol}$ < FER, the actual FER loss is very close to the target 
$P_{tol}$, while the actual loss of [12] is far less than the target. 
That means, compared with [12], the proposed pruning scheme 
utilizes the tolerable FER loss much more efficiently, and thus 
has lower complexity. 

Fig. 8. FER performance deterioration comparison of the pruned decoding 
scheme. 

V. Conclusion 

In this paper, a tree-pruning technique to reduce the 
complexity of (CA-)SCL is proposed. During the decoding 
process, the candidate paths with metrics less than a threshold are 
directly deleted to avoid redundant path extensions. Based on 
the reliabilities of the information/frozen bits, an upper bound 
of the path metric is derived to estimate the deterioration 
brought by the pruning operation. Utilizing this bound, a 
dynamic thresholding technique is presented. Compared with 
a similar existing scheme [13, the new proposed scheme can 
make full use of the given tolerant performance deterioration, 
and is much more efficient. 